AITopics | classification algorithm

Estimating the class prior and posterior from noisy positives and unlabeled data

Neural Information Processing Systems | Mar-17-2026, 08:57:18 GMT

We develop a classification algorithm for estimating posterior distributions from positive-unlabeled data that is robust to noise in the positive labels and effective for high-dimensional data. In recent years, several algorithms have been proposed to learn from positive-unlabeled data; however, many of these contributions remain theoretical, performing poorly on real high-dimensional data that is typically contaminated with noise. We build on this previous work to develop two practical classification algorithms that explicitly model the noise in the positive labels and utilize univariate transforms built on discriminative classifiers. We prove that these univariate transforms preserve the class prior, enabling estimation in the univariate space and avoiding kernel density estimation for high-dimensional data. The theoretical development and the parametric and nonparametric algorithms proposed here constitute an important step towards widespread use of robust classification algorithms for positive-unlabeled data.

artificial intelligence, machine learning, proceedings, (8 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


822621aaa88635437ea51023afdeaec2-Paper-Conference.pdf

Neural Information Processing Systems | Feb-16-2026, 05:19:48 GMT

artificial intelligence, machine learning, natural language, (21 more...)


Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)
(2 more...)


c88d0c9bea6230b518ce71268c8e49e0-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 21:30:50 GMT

This paper presents a text classification algorithm inspired by the notion of superposition of states in quantum physics.

artificial intelligence, machine learning, natural language, (19 more...)


Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Neuchâtel > Neuchâtel (0.04)
(2 more...)

Genre: Research Report (0.68)

Industry:

Leisure & Entertainment (0.47)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


6547884cea64550284728eb26b0947ef-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 16:54:21 GMT

arxiv preprint arxiv, classifier, diag, (15 more...)


Country:

North America > United States > California (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language (0.67)


6547884cea64550284728eb26b0947ef-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 16:54:15 GMT

algorithm, arxiv preprint arxiv, classifier, (14 more...)


Country:

North America > Canada (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View

Neural Information Processing Systems | Dec-24-2025, 03:03:36 GMT

Contemporary machine learning applications often involve classification tasks with many classes. Despite their extensive use, a precise understanding of the statistical properties and behavior of classification algorithms is still missing, especially in modern regimes where the number of classes is rather large. In this paper, we take a step in this direction by providing the first asymptotically precise analysis of linear multiclass classification. Our theoretical analysis allows us to precisely characterize how the test error varies over different training algorithms, data distributions, problem dimensions as well as number of classes, inter/intra class correlations and class priors. Specifically, our analysis reveals that the classification accuracy is highly distribution-dependent with different algorithms achieving optimal performance for different data distributions and/or training/features sizes. Unlike linear regression/binary classification, the test error in multiclass classification relies on intricate functions of the trained model (e.g., correlation between some of the trained weights) whose asymptotic behavior is difficult to characterize. This challenge is already present in simple classifiers, such as those minimizing a square loss. Our novel theoretical techniques allow us to overcome some of these challenges. The insights gained may pave the way for a precise understanding of other classification algorithms beyond those studied in this paper.

high-dimensional asymptotic view, multiclass classification, theoretical insight, (7 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Real-PGDN: A Two-level Classification Method for Full-Process Recognition of Newly Registered Pornographic and Gambling Domain Names

Wang, Hao; Wang, Yingshuo; Gan, Junang; Cheng, Yanan; Zhang, Jinshuai

arXiv.org Artificial Intelligence | Dec-1-2025

Online pornography and gambling have consistently posed regulatory challenges for governments, threatening both personal assets and privacy. Therefore, it is imperative to research the classification of newly registered Pornographic and Gambling Domain Names (PGDN). However, scholarly investigation into this topic is limited. Some previous efforts in PGDN classification pursue high accuracy using ideal sample data, while others employ up-to-date data from real-world scenarios but achieve lower classification accuracy. This paper introduces the Real-PGDN method, which accomplishes a complete process of timely and comprehensive real-data crawling, feature extraction with feature-missing tolerance, precise PGDN classification, and assessment of application effects in actual scenarios. Our two-level classifier, which integrates CoSENT (BERT-based), Multilayer Perceptron (MLP), and traditional classification algorithms, achieves 97.88% precision. The research process amasses the NRD2024 dataset, which contains continuous detection information over 20 days for 1,500,000 newly registered domain names across 6 directions. Results from our case study demonstrate that this method also maintains a forecast precision of over 70% for PGDN that are delayed in usage after registration.

artificial intelligence, domain name, machine learning, (20 more...)


arXiv: 2511.22215

Country: Asia > China > Heilongjiang Province (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)


Estimating the class prior and posterior from noisy positives and unlabeled data

Neural Information Processing Systems | Nov-21-2025, 14:52:48 GMT

We develop a classification algorithm for estimating posterior distributions from positive-unlabeled data that is robust to noise in the positive labels and effective for high-dimensional data. In recent years, several algorithms have been proposed to learn from positive-unlabeled data; however, many of these contributions remain theoretical, performing poorly on real high-dimensional data that is typically contaminated with noise. We build on this previous work to develop two practical classification algorithms that explicitly model the noise in the positive labels and utilize univariate transforms built on discriminative classifiers. We prove that these univariate transforms preserve the class prior, enabling estimation in the univariate space and avoiding kernel density estimation for high-dimensional data. The theoretical development and the parametric and nonparametric algorithms proposed here constitute an important step towards widespread use of robust classification algorithms for positive-unlabeled data.

algorithm, classification algorithm, name change, (7 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Formal Verification of Local Robustness of a Classification Algorithm for a Spatial Use Case

Longuet, Delphine; Elouazzani, Amira; Riveiros, Alejandro Penacho; Bastianello, Nicola

arXiv.org Artificial Intelligence | Nov-19-2025

Failures in satellite components are costly and challenging to address, often requiring significant human and material resources. Embedding a hybrid AI-based system for fault detection directly in the satellite can greatly reduce this burden by allowing earlier detection. However, such systems must operate with extremely high reliability. To ensure this level of dependability, we employ the formal verification tool Marabou to verify the local robustness of the neural network models used in the AI-based algorithm. This tool allows us to quantify how much a model's input can be perturbed before its output behavior becomes unstable, thereby improving trustworthiness with respect to its performance under uncertainty.

artificial intelligence, histogram, machine learning, (13 more...)


doi: 10.4204/EPTCS.436.4

arXiv: 2509.03948

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology: